Skip to content

dynamic mcp: harden client catalog discovery#772

Merged
nickmisasi merged 8 commits into
masterfrom
dynamic-mcp-02-client-catalog
Jun 10, 2026
Merged

dynamic mcp: harden client catalog discovery#772
nickmisasi merged 8 commits into
masterfrom
dynamic-mcp-02-client-catalog

Conversation

@nickmisasi

@nickmisasi nickmisasi commented May 23, 2026

Copy link
Copy Markdown
Collaborator

Summary

Hardens MCP client catalog discovery with safer tool identity handling (duplicate registrations are no longer silently dropped), plugin/embedded discovery refresh behavior, user-client synchronization, and retrieval override plumbing, with backwards-compatible tool policy lookup for configs without namespace plumbing. Also adds bare MCP tool name resolution, LLM context scaffolding for dynamic loading (request context, registry stash, telemetry hooks, preloaded tools), threads request context through conversation and API entry points, fixes channel analysis tool discoverability under narrow allowlists, and removes legacy log-based tool-trace plumbing superseded by OpenTelemetry.

Stack: 2/6

Ticket Link

N/A

Screenshots

N/A

Release Note

Improved MCP tool catalog discovery, bare tool name resolution, and channel analysis tool availability for dynamic loading.

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Runtime tracking and telemetry for dynamic MCP tool search/load and per-conversation IDs.
    • Namespacing and bare-name helpers plus retrieval-override keys to disambiguate tools across servers.
    • Per-user plugin/embedded MCP clients with optional shared tools cache.
  • Improvements

    • Request-context propagation across tool resolution and MCP flows for cancellation/timeouts.
    • Thread-safe client/tool caching, normalized server-origin handling and safer reconnection/refresh behavior.
  • Tests

    • Expanded unit, integration and e2e coverage for namespaced tools, caching, discovery and telemetry.

@github-actions

github-actions Bot commented May 23, 2026

Copy link
Copy Markdown

🤖 LLM Evaluation Results

OpenAI

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ OPENAI 28 21 7 75.0%

❌ Failed Evaluations

Show 7 failures

OPENAI

1. TestReactEval/[openai]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text "smiley_cat", which is neither a cat emoji (e.g., 🐱) nor a heart/love emoji (e.g., ❤️).

2. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output does not actually list any bugs; it states it cannot enumerate bugs and provides a template requesting the bug reports. Therefore it is not a list of bugs.

3. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output does not actually describe any specific bugs; it states inability to access reports and provides a template. Since no bugs are enumerated, it cannot include a description of each bug.

4. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output does not list any bugs at all, and therefore does not attribute each bug to a user (it only offers a template and asks for bug reports).

5. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention the specific bug (saving without a color / save button not doing anything) nor does it attribute it to @maria.nunez. It only asks for bug reports to be pasted and provides a template.

6. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention the specific bug about an end user being able to change the channel banner, nor does it attribute that bug to @maria.nunez. It only asks for bug reports and provides a template.

7. TestDirectMessageConversations/[openai]_bot_dm_tool_introspection

  • Score: 0.50
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation (Mattermost documentation link) but does not mention GitHub anywhere, so it does not satisfy the requirement to mention GitHub and refer to the documentation.

Anthropic

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ ANTHROPIC 28 21 7 75.0%

❌ Failed Evaluations

Show 7 failures

ANTHROPIC

1. TestReactEval/[anthropic]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text token "heart_eyes_cat" rather than an actual cat emoji (e.g., 😺) or a heart/love emoji (e.g., ❤️).

2. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output does not provide any bugs; it states the assistant cannot access trackers and suggests where to look, offering to organize bugs if pasted. This is not a list of bugs.

3. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output does not describe any specific bugs; it only explains lack of access and suggests ways to find bug reports. Therefore it does not include a description of each bug.

4. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output does not attribute any bugs to specific users/reporters; it only suggests places to find bug reports and offers to organize them if provided.

5. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention the specific bug (saving without a color and the save button not doing anything) nor attribute it to @maria.nunez. It only states lack of access and suggests ways to compile a list.

6. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention the specific bug about an end user being able to change the channel banner, nor does it attribute that bug to @maria.nunez. It only states lack of access and suggests ways to compile a list.

7. TestDirectMessageConversations/[anthropic]_bot_dm_tool_introspection

  • Score: 0.00
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation (docs.mattermost.com) but does not mention GitHub anywhere, so it fails the rubric requirement to both mention GitHub and refer to the documentation.

This comment was automatically generated by the eval CI pipeline.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbdf599280

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread mcp/user_clients.go
@nickmisasi

Copy link
Copy Markdown
Collaborator Author

@CodeRabbit review

@coderabbitai

coderabbitai Bot commented May 23, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented May 23, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 9ef93215-3dcc-4870-acd3-b2e400ce10ed

📥 Commits

Reviewing files that changed from the base of the PR and between 272ed61 and 209647b.

📒 Files selected for processing (2)
  • api/api_conversation_context.go
  • bifrost/bifrost_test.go
✅ Files skipped from review due to trivial changes (1)
  • bifrost/bifrost_test.go

📝 Walkthrough

Walkthrough

Adds request-scoped LLM tool runtime state and MCP dynamic tool telemetry; implements MCP tool namespacing/allowlist helpers; refactors tool resolution/store; makes MCP clients, user-clients, and manager context-aware with concurrency-safe caches and slug-based namespacing; and adds tests and retrieval-override metadata.

Changes

MCP Dynamic Tool Loading and Telemetry

Layer / File(s) Summary
LLM context and MCP dynamic tool telemetry
llm/context.go, llm/context_test.go
Context gains ConversationID and ToolRuntime with ToolRuntimeContext (MCP dynamic loading flags, telemetry interface, search/load tracking maps, restorer callback). Methods added to observe events, mark search/load usage, restore dynamically materialized tools, and decide one-time-per-tool success recording; tests exercise bot-name fallback and state transitions.

Tool Store Enhancements and MCP Tool Namespacing

Layer / File(s) Summary
ToolCall schema and ToolStore lookup/unloaded tracking
llm/tools.go, llm/tools_test.go
ToolResolver now accepts context.Context; ToolCall adds Schema and MCPBareName; ToolStore tracks unloadedMCPTools, introduces a nil-safe internal lookup supporting bare MCP names, and clears unloaded entries when tools are re-added; tests updated/added for resolution, unloaded lifecycle, and origin normalization.
MCP name helpers and allowlist filtering
llm/tools.go, llm/tools_test.go
Adds MCPToolNameSeparator and helpers: NamespaceMCPToolName, BareMCPToolName, IsBareMCPToolName, MCPToolNameMatches; adds FilterMCPToolsByAllowlist and updates RetainOnlyMCPTools to use new allowlist logic; tests cover per-origin and wildcard allowlist behaviour.

MCP Client Caching, Invalidation, and Tool Discovery

Layer / File(s) Summary
Client types and shared listing
mcp/client.go, mcp/client_embedded_oauth_test.go
Adds toolsMu and optional shared toolsCache to clients; introduces NewEmbeddedServerClientWithCache, listAllTools(ctx, session), and defensive Tools() snapshot; tests for embedded discovery, pagination, and cache handling.
NewPluginClient and reconnect handling
mcp/client.go, mcp/client_test.go
Adds NewPluginClient for per-user plugin bridge clients; on reconnect embedded clients refresh from reconnected client's Tools() snapshot while remote reconnect rediscovers tools via listAllTools and updates shared cache; refactors response assembly to use strings.Builder.
Tool invalidation and concurrency
mcp/client.go, mcp/tools_cache_test.go
Implements cache invalidation and notification coordination for discovered tools, defensive copies, and a test ensuring InvalidateServer is a no-op when key missing.

MCP Client Manager and User Client Integration

Layer / File(s) Summary
Manager context propagation and overrides
mcp/client_manager.go, mcp/retrieval_overrides.go, mcp/client_manager_test.go
GetToolsForUser now accepts ctx context.Context; manager uses context-aware client creation/connection, aggregates per-request plugin connection errors, trims/skips empty retrieval override summaries, and exposes retrieval-overrides via new ToolRetrievalOverride helper; tests added for override generation and request-scoped error behaviour.
UserClients concurrency, slugging, and resolvers
mcp/user_clients.go, mcp/user_clients_test.go
UserClients becomes context-aware (ConnectToRemoteServers, ConnectToEmbeddedServerIfAvailable, GetTools(ctx)); adds clientsMu, snapshot helpers, per-server slug generation with deconfliction, and createToolResolver that passes request ctx through to tool calls; tests cover deduplication, resolver cancellation, slug collision, and embedded idempotent connect.

llmcontext and API plumbing

Layer / File(s) Summary
Context-builder and API wiring
llmcontext/llm_context.go, api/*, conversations/*, telemetry/*
Threads context.Context through MCP tool retrieval: MCPToolProvider.GetToolsForUser(ctx, userID) and builder methods WithLLMContextTools(ctx, bot) / WithLLMContextDefaultTools(ctx, bot); updates API handlers and many call sites/tests to pass request contexts and updated resolver signatures.

Tools, web search, prompts, and tests updates

Layer / File(s) Summary
Tool resolver signature and call sites
mmtools/web_search.go, prompts/*, channels/*, conversations/*, search/search_test.go, telemetry/integration_test.go
Propagates context.Context into tool resolvers and provider calls; updates tests and call sites to match new resolver signature and to use request contexts for cancellable operations.
E2E and JS test updates
e2e/tests/tool-config/*
Updates runtime tool-name expectations in Playwright specs to use namespaced/runtime formats (e.g., mattermost__read_post).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Setup Cloud Test Server

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dynamic-mcp-02-client-catalog

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
mcp/client.go (1)

163-164: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against nil pluginAPI before session validation.

On Line 164, c.pluginAPI is dereferenced when sessionID != "". If this client is constructed without a plugin API, this path panics instead of returning a typed error.

Suggested fix
 func (c *EmbeddedServerClient) CreateClient(ctx context.Context, userID, sessionID string) (*Client, error) {
 	// Validate session exists before creating transport (unless empty for tool discovery)
 	if sessionID != "" {
+		if c.pluginAPI == nil {
+			return nil, fmt.Errorf("plugin API is required when sessionID is provided")
+		}
 		mmSession, err := c.pluginAPI.Session.Get(sessionID)
 		if err != nil {
 			return nil, fmt.Errorf("failed to get session: %w", err)
 		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp/client.go` around lines 163 - 164, The code dereferences c.pluginAPI when
sessionID != "" (calling c.pluginAPI.Session.Get) which will panic if
c.pluginAPI is nil; add a nil check for c.pluginAPI before calling Session.Get
and return a typed error (e.g., ErrPluginAPIUnavailable or a wrapped error) when
it's nil, then only call c.pluginAPI.Session.Get when c.pluginAPI is non-nil and
proceed to validate mmSession as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/tools.go`:
- Around line 665-668: The function toolArgsForLog should guard the argsGetter
before calling it: check if argsGetter == nil and return a safe string like "no
args getter" (or similar) instead of invoking it, then proceed to call
argsGetter(&raw) and handle its error as before; update the toolArgsForLog
function to perform this nil check (referencing the argsGetter parameter and the
toolArgsForLog function) so logging paths that pass a nil getter do not panic.

In `@mcp/client_manager.go`:
- Around line 205-213: The code currently persists transient connect errors by
calling userClient.appendInitialRemoteConnectError(connectErr) after
ConnectToPluginServer fails; instead, make plugin connect errors request-scoped
or cleared on successful connect: remove or avoid appending to the cached client
on transient failure and either collect connectErr in a local slice for this
request (return it with the response) or ensure you clear per-plugin entries on
subsequent successful ConnectToPluginServer calls (e.g., call a
clearInitialRemoteConnectError-like operation after a successful connect).
Update the logic around userClient.ConnectToPluginServer,
userClient.appendInitialRemoteConnectError and
userClient.InitialRemoteConnectErrors so the cached UserClient no longer
permanently stores transient plugin connect failures.
- Around line 184-188: Change GetToolsForUser to accept ctx context.Context as
its first parameter (i.e., func (m *ClientManager) GetToolsForUser(ctx
context.Context, userID string) ...) and remove the context.Background() call;
propagate this ctx into the downstream call to m.getClientForUser(ctx, userID)
and any other internal calls within GetToolsForUser so caller
cancellation/deadlines flow through the LLM entry path.

In `@mcp/user_clients.go`:
- Around line 371-377: The code currently creates a fallback callCtx :=
context.Background() before invoking client.CallToolWithMetadata which allows
tool calls to outlive request cancellation; instead, require a request-scoped
context: remove the context.Background() fallback, validate that llmContext and
llmContext.RequestContext are non-nil (or ensure a context is passed into the
resolver closure) and return an error immediately if missing, then call
client.CallToolWithMetadata using the verified llmContext.RequestContext along
with toolName, args and metadata.

---

Outside diff comments:
In `@mcp/client.go`:
- Around line 163-164: The code dereferences c.pluginAPI when sessionID != ""
(calling c.pluginAPI.Session.Get) which will panic if c.pluginAPI is nil; add a
nil check for c.pluginAPI before calling Session.Get and return a typed error
(e.g., ErrPluginAPIUnavailable or a wrapped error) when it's nil, then only call
c.pluginAPI.Session.Get when c.pluginAPI is non-nil and proceed to validate
mmSession as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2d34a450-b143-49b9-abb0-256fdd5cd29c

📥 Commits

Reviewing files that changed from the base of the PR and between c65e350 and dbdf599.

📒 Files selected for processing (17)
  • llm/context.go
  • llm/context_test.go
  • llm/tools.go
  • llm/tools_test.go
  • mcp/client.go
  • mcp/client_embedded_oauth_test.go
  • mcp/client_manager.go
  • mcp/client_manager_filter_test.go
  • mcp/client_manager_test.go
  • mcp/client_test.go
  • mcp/retrieval_overrides.go
  • mcp/tools_cache_test.go
  • mcp/user_clients.go
  • mcp/user_clients_test.go
  • mcpserver/eval_helpers_test.go
  • search/search_test.go
  • telemetry/integration_test.go

Comment thread llm/tools.go Outdated
Comment thread mcp/client_manager.go Outdated
Comment thread mcp/client_manager.go Outdated
Comment thread mcp/user_clients.go Outdated
@nickmisasi nickmisasi force-pushed the dynamic-mcp-02-client-catalog branch from dbdf599 to eb5e85f Compare May 23, 2026 02:52

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Security Review: No Actionable Findings

Reviewed the full diff (3014 additions, 424 deletions across 38 files) for this MCP client catalog hardening PR. The changes are security-positive overall.

Key areas examined:

  • channelAnalysisToolBot tool expansion (api/api_channel.go): Overrides the bot's MCP tool allowlist for channel analysis. Initially appeared risky since channel analysis auto-approves all tool calls, but the downstream AnalyzeChannel function correctly scopes the tool set to only read_channel and get_channel_info with bound channel_id parameters before the LLM sees them. The expanded catalog is only used to locate these two tools.

  • Context propagation: context.Background() removed from production MCP tool paths. GetToolsForUser, createAndStoreUserClient, ConnectToRemoteServers, and ConnectToEmbeddedServerIfAvailable now accept caller-scoped context.Context. Tool resolvers require RequestContext and fail fast if missing.

  • Tool namespacing and bare-name fallback (llm/tools.go): lookupTool bare-name resolution correctly returns ambiguous results as not-found. mcpToolAllowed scopes allowlist checks by ServerOrigin, preventing cross-server tool confusion.

  • Concurrency safety: New mutex protection (toolsMu, clientsMu, discoveryMu, notifyOwnerMu) on Client and UserClients prevents race conditions during tool list invalidation and rediscovery.

  • Plugin MCP client (mcp/client.go): NewPluginClient correctly injects authenticated user ID via headerTransport from session-derived data, not user input. Tool list changed handlers properly invalidate caches.

  • Trace logging (llm/tools.go): TraceResolved logs full tool results when doTrace is true, but this flag is not enabled in any production code path.

No medium, high, or critical vulnerabilities identified.

Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

@nickmisasi nickmisasi requested a review from crspeller May 25, 2026 14:34
@nickmisasi nickmisasi force-pushed the dynamic-mcp-02-client-catalog branch from 6986e50 to 99205c9 Compare May 26, 2026 18:09

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Security Review: No Actionable Findings

Reviewed the full diff for this MCP client catalog hardening PR. The changes are security-positive overall.

Key areas examined:

  • channelAnalysisToolBot tool expansion (api/api_channel.go): Overrides the bot's MCP tool allowlist (AutoEnableNewMCPTools = true, EnabledMCPTools = nil). Confirmed safe: channels/channels.go:95-111 immediately scopes to a new scopedTools store containing only read_channel and get_channel_info with bound channel_id parameters before the LLM executes.

  • lookupTool bare-name fallback (llm/tools.go): When the LLM calls a tool by bare name, the store searches all namespaced tools. Correctly returns not-found when ambiguous (multiple servers share the same bare tool name). The resolver always executes within the requesting user's permission scope.

  • ToolPolicyLookupName and policy bypass (mcp/tool_policy.go): The production ToolPolicyFunc wraps LookupToolPolicy, which calls ToolPolicyLookupName to strip namespace prefixes before matching admin-configured policies. Namespaced runtime names (jira__get_issue) correctly match bare policies (get_issue). No bypass path exists.

  • Context propagation: context.Background()/context.TODO() removed from MCP tool paths. Tool resolvers now require RequestContext and fail fast with "missing request context for MCP tool call" if nil. This is a security improvement.

  • Plugin MCP client (mcp/client.go:NewPluginClient): MMUserIDHeader is injected from c.userID (sourced from the authenticated session, not user input). The sourcePluginAPI nil check is preserved.

  • Concurrency safety: New mutex protection (toolsMu, clientsMu, discoveryMu) prevents race conditions during ToolListChanged notifications and rediscovery. createAndStoreUserClient uses a proper double-check-after-lock pattern.

  • Trace logging (llm/tools.go): TraceResolved logs full tool results, but only when doTrace is true. The production NewToolStore() call (at llmcontext/llm_context.go:191) passes no options, leaving doTrace as false.

No medium, high, or critical vulnerabilities identified.

Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

@nickmisasi nickmisasi force-pushed the dynamic-mcp-02-client-catalog branch from 99205c9 to e9e6f79 Compare May 27, 2026 13:36

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Security Review — No Findings

Reviewed the full diff (2840 additions across 36 files) for vulnerabilities in channel data isolation, user identity handling, tool/MCP boundaries, prompt injection surface, file access control, search scoping, and secrets exposure.

Security-positive changes in this PR:

  • Request context propagation: GetToolsForUser and tool resolvers now require context.Context from the authenticated request, replacing prior context.Background()/context.TODO() calls. This ensures MCP tool calls respect request-scoped cancellation and cannot outlive the originating session.
  • Tool namespacing: MCP tools are now namespaced by server slug (e.g., jira__search instead of search), eliminating cross-server tool name collisions. The lookupTool bare-name fallback correctly rejects ambiguous matches.
  • Concurrency safety: Client.tools access is now mutex-protected, and UserClients operations use proper RLock/Lock patterns with snapshot-then-release to avoid holding locks during network calls.
  • Tool resolver hardening: The resolver now fails fast with "missing request context for MCP tool call" when RequestContext is nil, enforcing the coding guideline against context.Background() in production paths.
  • Request-scoped error handling: Plugin connection errors are no longer persisted on the cached UserClients, preventing transient failures from leaking across requests.

Areas examined with no issues found:

  • channelAnalysisToolBot: Overrides the bot's AutoEnableNewMCPTools/EnabledMCPTools to load the full MCP catalog. While this expands available tools beyond the bot's configured allowlist, the requesting user's identity and permissions still scope all tool execution via RequestContext. The admin-level filterToolsByConfig disabled-tool filtering still applies. This is a policy choice, not a permission bypass.
  • NewPluginClient: User identity is correctly baked into the HTTP transport via MMUserIDHeader from the session-authenticated userID, not from user-supplied input.
  • ToolPolicyLookupName / mcpToolAllowed bare-name fallbacks: Always scoped by ServerOrigin, so cross-server policy confusion is not possible.
  • No hardcoded credentials, secrets in logs, or path traversal vectors introduced.
Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

Comment thread llm/context.go Outdated
Comment on lines +59 to +89
// MCPDynamicToolLoading indicates this context uses strict MCP dynamic loading.
MCPDynamicToolLoading bool
// MCPDynamicToolTelemetry receives low-cardinality dynamic MCP tool events.
MCPDynamicToolTelemetry MCPDynamicToolTelemetry
MCPDynamicToolSearchUsed bool
MCPDynamicLoadedToolNames map[string]bool
MCPDynamicSearchLoadCallSuccessRecorded map[string]bool

// DisabledMCPServerOrigins contains per-user disabled MCP server origins that
// must be removed before strict registry construction.
DisabledMCPServerOrigins []string

// KeepMCPTool, when non-nil, is applied to MCP tools before strict registry
// construction and before flag-off visible MCP insertion.
KeepMCPTool func(Tool) bool

// PreloadedMCPTools contains exact-or-bare MCP tool selectors for internal
// predefined flows. They are selected only from the already-authorized MCP
// catalog and are request scoped.
PreloadedMCPTools []EnabledMCPTool

// MCPToolRegistry holds the strict MCP tool registry that was built
// alongside Tools, when MCP dynamic tool loading is enabled. It is stashed
// here so callers can replay loaded-tool restoration after the conversation
// row exists without rebuilding the entire tool store.
//
// Stored as `any` to avoid an llm -> mcp import cycle: the mcp package
// already imports llm, and the only consumer that needs the concrete type
// is the llmcontext package, which can import both. Type-assert to
// *mcp.ToolRegistry there.
MCPToolRegistry any

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These don't belong here. This struct is for prompt building not passing stuff like this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cleaned this up by moving the dynamic MCP/tool execution fields into a nested ToolRuntimeContext on llm.Context, so they’re no longer loose prompt-building fields.

I think this should stay reachable from llm.Context because Context is already the per-turn carrier used by tool construction and execution (Tools already lives there), not just rendered prompt text. The dynamic MCP state needs to be available across the same flow: building the tool store, resolving meta-tools, recording per-turn load/search telemetry, and restoring loaded tools. Splitting it into a separate top-level parameter would force the same runtime object through all resolver/toolrunner/context-builder paths without changing the ownership model. If you're passing it all around to the same places alongside the Context it's looking like a shared tool context anyways.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok makes sense. I think this could use a refactor to make that clear, but out of scope here.

Comment thread llm/context.go Outdated

// RequestContext carries the caller's request-scoped context for downstream
// work such as MCP tool discovery. May be nil in tests.
RequestContext stdcontext.Context

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't belong here, we should update to pass context as first parameter if needed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You thumbs upped but it is still there.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good now. Did refactor the first time but missed a few spots

Comment thread llm/context.go Outdated
// already imports llm, and the only consumer that needs the concrete type
// is the llmcontext package, which can import both. Type-assert to
// *mcp.ToolRegistry there.
MCPToolRegistry any

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically for this one I don't think we want any

Comment thread mcp/user_clients.go
if client != nil {
candidates = append(candidates, client.config.Name)
}
candidates = append(candidates, serverID)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the opeque serverID makes sense over the base URL. These are LLM visible so ideally they are not something strange.

There is also the reported name from the MCP server we could use.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remote servers: require/configure a name through system console; serverID is that name anyway.
Plugin MCP with a configured name: client.config.Name is the plugin-provided name, so that wins.
Plugin MCP without a configured name: client.config.Name is empty, then fallback uses serverID, which is plugin://.
For that fallback case, serverID and BaseURL are basically the same synthetic origin, so there isn't much of a difference in ordering. I preserved both fallbacks more for safety than anything.

Comment thread llm/tools.go Outdated
func (s *ToolStore) AddTools(tools []Tool) {
for _, tool := range tools {
s.tools[tool.Name] = tool
if s.unloadedMCPTools != nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary check.

Comment thread llm/tools.go
tools map[string]Tool
authErrors []ToolAuthError
tools map[string]Tool
unloadedMCPTools map[string]ToolInfo

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more complicated than nessisary? Why have toolinfo? Also why is it unloaded MCP tools and not what is loaded?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unloadedMCPTools are tools we know to exist in the catalog (ie, from a search) but haven't been loaded yet. ToolInfo is a lightweight struct to reflect what we actually know about it at this point. I've added a comment to clarify

Comment thread mcp/client.go Outdated
toolsDirty bool
toolsGeneration uint64
notifyOwnerMu sync.RWMutex
notifyOwner *Client

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like we should find a better solution for this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #772 (comment)

Comment thread mcp/client.go Outdated
},
&mcp.ClientOptions{},
&mcp.ClientOptions{
ToolListChangedHandler: func(ctx context.Context, _ *mcp.ToolListChangedRequest) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for supporting the MCP feature where an MCP server can add tools while holding the connection open? This feels orthogonal to what this PR is trying to accomplish. Maybe a separate PR? Not sure it's even worth supporting.

@nickmisasi nickmisasi May 27, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was in the original spec

But I think we can kick support for this down the road now that we have #732 opened, at least end users will be able to refresh.

Removed in next commit

@nickmisasi nickmisasi requested a review from crspeller May 27, 2026 19:23

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Security Review — No Findings

Reviewed the full diff (2564 additions, 474 deletions across 43 files) for vulnerabilities in channel data isolation, user identity handling, tool/MCP boundaries, prompt injection surface, file access control, search scoping, and secrets exposure. CI checks pass (31/32 passed, 1 skipped).

Security-positive changes in this PR:

  • Request context propagation: GetToolsForUser, ConnectToRemoteServers, ConnectToEmbeddedServerIfAvailable, and tool resolvers now accept context.Context from the authenticated request, replacing context.Background()/context.TODO() calls. MCP tool calls now respect request-scoped cancellation.
  • Tool namespacing: MCP tools are namespaced by server slug (e.g., jira__search), preventing cross-server tool name collisions. lookupTool bare-name fallback correctly rejects ambiguous matches (multiple servers sharing the same bare name).
  • Concurrency safety: Client.tools and UserClients.clients access is now mutex-protected with proper RLock/Lock patterns. Tools() returns maps.Clone() to prevent concurrent modification.
  • Request-scoped error handling: Plugin connection errors are no longer persisted on the cached UserClients (cloneMCPErrors + local appendMCPError), preventing transient failures from leaking across requests.

Areas examined with no issues found:

  • channelAnalysisToolBot (api/api_channel.go:179): Overrides the bot's AutoEnableNewMCPTools/EnabledMCPTools to load the full MCP catalog for tool discovery. Confirmed safe: channels/channels.go:96-111 immediately scopes to a new scopedTools store containing only read_channel and get_channel_info with bound channel_id parameters before the LLM executes. No other tools reach the ToolRunner.
  • NewPluginClient (mcp/client.go): User identity is injected via MMUserIDHeader from the session-authenticated userID (not user-supplied input). sourcePluginAPI nil check is preserved.
  • ToolPolicyLookupName / mcpToolAllowed: Bare-name fallbacks are always scoped by ServerOrigin, preventing cross-server policy confusion. LookupToolPolicy dispatches by serverBaseURL before stripping namespaces.
  • lookupTool / lookupUnloadedMCPTool: Both iterate all tools for bare-name matches but return not-found on ambiguity (multiple matches), preventing unintended tool resolution.
  • No hardcoded credentials, secrets in logs/responses, path traversal vectors, or SSRF vectors introduced.
Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

@crspeller crspeller left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just need to remove that context from the struct and I think this is good.

@nickmisasi nickmisasi requested a review from crspeller May 29, 2026 15:48
Base automatically changed from dynamic-mcp-01-config-store to master June 10, 2026 16:04
nickmisasi and others added 7 commits June 10, 2026 12:09
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Removed upstream in PR #542 (5ab61bf) when OpenTelemetry replaced
log-based tracing. Restore master's no-arg NewToolStore signature
and drop TraceLog, TraceUnknown, TraceResolved, and
LogUnknownToolWarning while keeping the dynamic-MCP additions.
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@nickmisasi nickmisasi force-pushed the dynamic-mcp-02-client-catalog branch from c60dec5 to 272ed61 Compare June 10, 2026 16:19
Co-authored-by: Cursor <cursoragent@cursor.com>
@nickmisasi nickmisasi merged commit 644a1dd into master Jun 10, 2026
39 checks passed
@nickmisasi nickmisasi deleted the dynamic-mcp-02-client-catalog branch June 10, 2026 16:59

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review — No Findings

Reviewed the full diff (2564 additions, 474 deletions across 46 files) for vulnerabilities in channel data isolation, user identity handling, tool/MCP boundaries, prompt injection surface, file access control, search scoping, and secrets exposure. CI checks pass (33/34, 1 skipped).

Security-positive changes in this PR:

  • Request context propagation: GetToolsForUser, ConnectToRemoteServers, ConnectToEmbeddedServerIfAvailable, and tool resolvers now accept context.Context from the authenticated request. The ToolResolver signature itself was updated to func(ctx context.Context, llmCtx *Context, argsGetter ToolArgumentGetter), eliminating context.Background() shortcuts across production tool-call paths. MCP tool calls now respect request-scoped cancellation and timeouts.
  • Concurrency safety: Client.tools access is now protected by toolsMu (RWMutex), and UserClients.clients by clientsMu. Tools() returns maps.Clone() to prevent concurrent map modification. ConnectToEmbeddedServerIfAvailable and ConnectToPluginServer use double-check-after-lock patterns to avoid duplicate connections.
  • Request-scoped error handling: cloneMCPErrors + local appendMCPError ensure plugin connection errors are request-scoped and no longer persisted on the cached UserClients, preventing transient failures from leaking across requests.
  • Tool namespacing: MCP tools are namespaced by server slug (e.g., jira__search), preventing cross-server tool name collisions. lookupTool bare-name fallback correctly rejects ambiguous matches (multiple servers sharing the same bare name return not-found).

Areas examined with no issues found:

  • channelAnalysisToolBot (api/api_channel.go:178): Creates a modified bot with AutoEnableNewMCPTools=true and EnabledMCPTools=nil to expand MCP catalog discovery. Confirmed safe: channels/channels.go:96-111 immediately replaces the tool store with a scopedTools store containing only read_channel and get_channel_info with bound channel_id parameters before the LLM executes. No other tools reach the ToolRunner.
  • lookupTool bare-name fallback (llm/tools.go:360-383): When the LLM calls a tool by bare name, the store searches all namespaced tools. Correctly returns not-found when ambiguous (multiple MCP servers share the same bare tool name). Built-in tools (empty ServerOrigin) are excluded from bare-name matching. The resolver always executes within the requesting user's permission scope.
  • ToolPolicyLookupName / LookupToolPolicy (mcp/tool_policy.go): Runtime namespaced names (e.g., jira__get_issue) are correctly stripped to bare names before matching admin-configured policies. An exact configured name still wins over bare-name fallback. Policy lookups are always scoped by serverBaseURL, preventing cross-server policy confusion.
  • shouldAutoExecuteTool / botChannelAutoEverywhereKeepTool: Both pass namespaced runtime names through the ToolPolicyChecker, which dispatches to LookupToolPolicyToolPolicyLookupName. Namespace stripping ensures namespaced tools correctly match their configured auto-run/auto-run-everywhere policies.
  • NewPluginClient (mcp/client.go:311-388): User identity is injected via MMUserIDHeader from userID (sourced from the authenticated session, not user-supplied input). sourcePluginAPI nil check is the first guard.
  • ToolRuntimeContext (llm/context.go): New runtime fields (KeepMCPTool, PreloadedMCPTools, restoreMCPDynamicTools, telemetry hooks) are scaffolding for dynamic loading in later stack PRs. All fields are set internally during context building, never from user input.
  • No hardcoded credentials, secrets in logs/responses, path traversal vectors, or SSRF vectors introduced.
Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants